Construction of spoken language model including fillers using filler prediction model

Authors

  • Kengo Ohta
  • Masatoshi Tsuchiya
  • Seiichi Nakagawa
Abstract

This paper proposes a novel method for constructing a spoken language model that includes fillers from a corpus that contains none, using a filler prediction model. The filler prediction model consists of two submodels: a filler insertion model, which predicts the places where fillers should be inserted, and a filler selection model, which predicts an appropriate filler for each of those places. The method converts a corpus that covers domain-relevant topics but includes no fillers into a corpus that contains fillers as well as domain-relevant topics. Experiments on the Corpus of Spontaneous Japanese show that language models constructed by the proposed method achieve performance quite close to that of a conventional trigram language model constructed from a real spontaneous speech corpus that includes fillers.
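The two-stage idea in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the abstract does not say how either submodel is estimated, so the sketch assumes simple counts conditioned on the preceding word, a fixed insertion threshold, and invented romanized toy sentences. The names `train`, `add_fillers`, `FILLERS`, and `threshold` are all hypothetical.

```python
from collections import Counter, defaultdict

def train(sentences, fillers):
    """Count-based stand-ins for the two submodels (an assumption, not the paper's models)."""
    gaps = Counter()               # times each word was followed by another word
    inserted = Counter()           # times a filler sat in the gap after that word
    chosen = defaultdict(Counter)  # which filler sat in that gap
    for sentence in sentences:
        prev, pending = "<s>", None
        for token in sentence:
            if token in fillers:
                pending = pending or token  # keep only the first filler in a gap
                continue
            gaps[prev] += 1
            if pending is not None:
                inserted[prev] += 1
                chosen[prev][pending] += 1
            prev, pending = token, None
    return gaps, inserted, chosen

def add_fillers(sentence, model, threshold=0.5):
    """Insert a filler wherever the estimated insertion probability reaches the threshold."""
    gaps, inserted, chosen = model
    out, prev = [], "<s>"
    for token in sentence:
        # Insertion step: should a filler go in the gap after `prev`?
        if gaps[prev] and inserted[prev] / gaps[prev] >= threshold:
            # Selection step: pick the filler seen most often in this context.
            out.append(chosen[prev].most_common(1)[0][0])
        out.append(token)
        prev = token
    return out

# Invented toy data: a tiny "spoken" corpus with fillers, and a filler-free sentence.
FILLERS = {"eto", "ano"}
spoken = [
    ["eto", "kyou", "wa", "ano", "ame", "desu"],
    ["eto", "ashita", "wa", "ano", "hare", "desu"],
]
model = train(spoken, FILLERS)
result = add_fillers(["kyou", "wa", "hare", "desu"], model)
print(result)
```

Applying `add_fillers` to every sentence of a filler-free domain corpus would give the kind of converted corpus the abstract describes, which could then be passed to any n-gram trainer.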

Similar resources

Evaluating spoken language model based on filler prediction model in speech recognition

We propose a method that uses a filler prediction model for building a language model that includes fillers from a corpus without fillers. In our method, a filler prediction model is trained from a corpus that does not cover domain-relevant topics. It recovers fillers in inexact transcribed corpora in the target domain, and then a language model that includes fillers is built from the corpora. ...

Pauses following fillers in L1 and L2 German map task dialogues

Fillers and pauses in spoken language indicate hesitations. Filler type (uh vs. um) is believed to signal a minor or major following speech delay in L1. We examined whether advanced speakers of L2 German use pauses following filler type (äh vs. ähm) in the same way as native speakers do. Two Map Task corpora of L1 and L2 were contrasted with respect to speaker role, filler type and the exact ti...

Japanese copula marker works as a filler in spontaneous speech

The Japanese syntactic form desune generally works as a copula marker, but sometimes it also works as a filler in spoken language. We examine the distribution and characteristics of two aspects of desune through spontaneous speech corpora, called ASU and CSJ. We also compare desune with other fillers and indicate the similarities between them.

Modeling molecular transport in composite membranes with tubular fillers

Nanotubes have been shown to possess intriguing mass transport properties, and are being incorporated into polymeric membranes for molecular separations. Although models have been developed to predict the effective permeability and selectivity of composite membranes with non-spherical fillers, they only apply to fillers with isotropic transport properties. However, molecular transport in tubula...

Understanding Chinese Spontaneous Speech - Are Mandarin and Cantonese Very Different?

This paper presents a study of the similarity between Cantonese and Mandarin spoken and written texts. Spontaneous speech in Cantonese consists of colloquial and filler phrases, but its keywords are similar to those of Mandarin. We use a statistical tool to extract Cantonese phrases from a spontaneous speech database that we collected using a Wizard-of-Oz setup. More fillers are collected from written Cantonese...

Publication date: 2007